Discriminant analysis using the unweighted sum of binary variables: a comparison of model selection methods

Discriminant analysis can best be defined as a technique that classifies an individual into one of several distinct populations on the basis of a set of measurements. Stepwise discriminant analysis (SDA) is concerned with selecting the most important variables while retaining the highest possible discriminating power. Selecting a smaller number of variables is often necessary for a variety of reasons. In existing statistical software packages, SDA is based on the classic feature selection methods, and many problems with such stepwise procedures have been identified. This work presents a new method based on the tabu search metaheuristic, together with experimental results on selected benchmark datasets. The results are promising.
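
The abstract does not describe the tabu search procedure itself, so the following is only a generic sketch of how tabu search can be applied to variable subset selection, not the authors' method. The names `tabu_select`, `toy_score`, `tenure`, and `n_iter` are hypothetical. In an SDA setting the `score` function would be some measure of discriminating power (for example a cross-validated classification rate); here a toy score stands in for it.

```python
from collections import deque
from typing import Callable, FrozenSet, Tuple


def tabu_select(
    n_vars: int,
    score: Callable[[FrozenSet[int]], float],
    n_iter: int = 50,
    tenure: int = 3,
) -> Tuple[FrozenSet[int], float]:
    """Tabu search over variable subsets; each move flips one variable in or out."""
    current: FrozenSet[int] = frozenset()
    best, best_score = current, score(current)
    tabu = deque(maxlen=tenure)  # variables flipped during the last `tenure` moves

    for _ in range(n_iter):
        candidates = []
        for v in range(n_vars):
            neighbour = current ^ {v}  # add v if absent, drop it if present
            s = score(neighbour)
            # Aspiration rule: a tabu move is allowed if it beats the best so far.
            if v not in tabu or s > best_score:
                candidates.append((s, v, neighbour))
        if not candidates:
            break
        # Take the best admissible move even when it is worse than `current`;
        # this is what lets the search leave a local optimum.
        s, v, current = max(candidates, key=lambda c: (c[0], -c[1]))
        tabu.append(v)
        if s > best_score:
            best, best_score = current, s
    return best, best_score


def toy_score(subset: FrozenSet[int]) -> float:
    """Hypothetical score: variables 0 and 2 are informative, every variable costs 0.25."""
    return sum(1.0 for v in subset if v in (0, 2)) - 0.25 * len(subset)


if __name__ == "__main__":
    print(tabu_select(4, toy_score))
```

Unlike forward or backward stepwise selection, the search keeps moving after reaching a subset that no single flip improves, and the tabu list stops it from immediately undoing its most recent moves.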

Bayesian Model Averaging to Account for Model Uncertainty in Estimates of a Vaccine's Effectiveness

10.1101/2021.05.12.21257126 ◽ 2021

Author(s): Carlos R Oliveira ◽ Eugene D Shapiro ◽ Daniel M Weinberger

Keyword(s): Model Selection ◽ Model Uncertainty ◽ Bayesian Model ◽ Bayesian Model Averaging ◽ Model Averaging ◽ Selection Methods ◽ Final Model ◽ Negative Case ◽ Confounder Selection ◽ Control Study

Vaccine effectiveness (VE) studies are often conducted after the introduction of new vaccines to ensure they provide protection in real-world settings. Although susceptible to confounding, the test-negative case-control study design is the most efficient method to assess VE post-licensure. Control of confounding is often needed during the analyses, which is most efficiently done through multivariable modeling. When a large number of potential confounders are being considered, it can be challenging to know which variables need to be included in the final model. This paper highlights the importance of considering model uncertainty by re-analyzing a Lyme VE study using several confounder selection methods. We propose an intuitive Bayesian Model Averaging (BMA) framework for this task and compare the performance of BMA to that of traditional single-best-model-selection methods. We demonstrate how BMA can be advantageous in situations when there is uncertainty about model selection by systematically considering alternative models and increasing transparency.
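
The abstract does not say how the model weights are obtained. One common approximation, used here purely as an illustration, takes the posterior probability of each candidate model to be proportional to exp(-BIC/2) under equal prior model probabilities. The function names `bma_weights` and `bma_summary` are hypothetical, and the inputs would be the per-model estimates (for example log odds ratios) and their variances.

```python
import math
from typing import List, Sequence, Tuple


def bma_weights(bics: Sequence[float]) -> List[float]:
    """Approximate posterior model probabilities from BIC values.

    Assumes equal prior probability for every candidate model.
    """
    best = min(bics)  # subtract the minimum for numerical stability
    raw = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(raw)
    return [r / total for r in raw]


def bma_summary(
    estimates: Sequence[float],
    variances: Sequence[float],
    weights: Sequence[float],
) -> Tuple[float, float]:
    """Model-averaged estimate and its variance.

    The variance combines the within-model variances with the spread of the
    per-model estimates around the averaged estimate.
    """
    mean = sum(w * e for w, e in zip(weights, estimates))
    var = sum(
        w * (v + (e - mean) ** 2)
        for w, e, v in zip(weights, estimates, variances)
    )
    return mean, var
```

The between-model term in the variance is where model uncertainty enters: a single-best-model analysis reports only the within-model variance of the chosen model.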

An Optimal Categorization of Feature Selection Methods for Knowledge Discovery

Data Mining ◽ 10.4018/978-1-4666-2455-9.ch005 ◽ 2013 ◽ pp. 92-106

Author(s): Harleen Kaur ◽ Ritu Chauhan ◽ M. Alam

Keyword(s): Data Mining ◽ Feature Selection ◽ Discriminant Analysis ◽ Medical Data ◽ Stepwise Discriminant Analysis ◽ Selection Methods ◽ Medical Databases ◽ Active Research ◽ Potential Improvement ◽ Large Effort

The continuous availability of massive experimental medical data has given impetus to a large effort to develop mathematical, statistical, and computational intelligence techniques for inferring models from medical databases. Feature selection has been an active research area in the pattern recognition, statistics, and data mining communities. However, there have been relatively few studies on preprocessing the data used as input to data mining systems for medical data. In this chapter, the authors focus on several feature selection methods and their effectiveness in preprocessing input medical data. They evaluate feature selection algorithms such as Mutual Information Feature Selection (MIFS), Fast Correlation-Based Filter (FCBF), and Stepwise Discriminant Analysis (STEPDISC) in combination with the naive Bayes and linear discriminant analysis classifiers. The experimental analysis of feature selection techniques on medical databases has enabled the authors to find a small number of informative features, leading to potential improvement in medical diagnosis by reducing the size of the data set, eliminating irrelevant features, and decreasing the processing time.
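
As a rough illustration of the first of these algorithms, MIFS greedily adds the feature with the highest mutual information with the class label, penalised by its mutual information with the features already selected. The sketch below assumes discrete features and a plug-in estimate of mutual information; the names `mutual_information`, `mifs`, and the default `beta` are choices made for this example and are not taken from the chapter.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def mutual_information(x: Sequence[Hashable], y: Sequence[Hashable]) -> float:
    """Plug-in estimate of I(X; Y) in nats for two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * math.log((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in pxy.items()
    )


def mifs(
    features: Dict[str, Sequence[Hashable]],
    labels: Sequence[Hashable],
    k: int,
    beta: float = 0.5,
) -> List[str]:
    """Greedy MIFS: maximise relevance minus beta times redundancy."""
    selected: List[str] = []
    remaining = set(features)
    while remaining and len(selected) < k:

        def criterion(name: str) -> float:
            relevance = mutual_information(features[name], labels)
            redundancy = sum(
                mutual_information(features[name], features[s]) for s in selected
            )
            return relevance - beta * redundancy

        # Sort so that ties are broken deterministically by feature name.
        best = max(sorted(remaining), key=criterion)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With `beta = 0` the procedure reduces to ranking features by relevance alone; larger values penalise features that duplicate information already captured by the selected set.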

Learning Coefficient of Vandermonde Matrix-Type Singularities in Model Selection

Entropy ◽ 10.3390/e21060561 ◽ 2019 ◽ Vol 21 (6) ◽ pp. 561

Author(s): Miki Aoyagi

Keyword(s): Model Selection ◽ Information Criteria ◽ Learning Systems ◽ Vandermonde Matrix ◽ Selection Methods ◽ Learning Models ◽ Matrix Type ◽ Log Canonical Threshold ◽ Log Canonical ◽ Blowing Up

In recent years, selecting appropriate learning models has become more important with the increased need to analyze learning systems, and many model selection methods have been developed. The learning coefficient in Bayesian estimation, which measures the learning efficiency of singular learning models, plays an important role in several information criteria. In regular models the learning coefficient is known to equal half the dimension of the parameter space, while in singular models it is smaller and varies from model to model. Mathematically, the learning coefficient is known as the log canonical threshold. In this paper, we provide a new rational blowing-up method for obtaining these coefficients. In the application to Vandermonde matrix-type singularities, we show the efficiency of such methods.
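
For orientation, the role of the learning coefficient can be written out in the standard notation of singular learning theory. The symbols below are not defined in the abstract and are assumptions of this sketch: $K(w)$ is the Kullback-Leibler divergence from the true distribution to the model at parameter $w$, $\varphi(w)$ is the prior, $S_n$ is the empirical entropy, $F_n$ is the Bayesian free energy, and $d$ is the dimension of the parameter space.

```latex
\begin{align*}
  % Zeta function of the learning model; its largest pole is z = -\lambda,
  % and m denotes the order of that pole.
  \zeta(z) &= \int K(w)^{z}\,\varphi(w)\,dw, \\
  % Asymptotic expansion of the free energy for sample size n.
  F_n &= n S_n + \lambda \log n - (m - 1)\log\log n + O_p(1), \\
  % Regular models: the expansion reduces to the familiar BIC penalty.
  \lambda &= \frac{d}{2}, \quad m = 1 \quad \text{(regular case)}.
\end{align*}
```

In the singular case $\lambda$ is smaller than $d/2$, which is why it has to be computed model by model, for instance by resolving the singularities of $K(w)$ through blow-ups.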
